Semantic Context Detection Using Audio Event Fusion: Camera-Ready Version

Authors

  • Wei-Ta Chu
  • Wen-Huang Cheng
Abstract

Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events in action movies: gunshot, explosion, engine, and car braking. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining using audio characteristics.
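The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: only the four event names come from the abstract. The discrete observation symbols, the one-state toy models, the softmax normalisation of likelihoods, and the mean pooling over a window are all assumptions made for illustration. The first level scores each audio segment under one HMM per event using the forward algorithm; the second level pools those scores into a fixed-length vector that a context-level classifier (for example an SVM or an ergodic HMM) could consume.

```python
import math

# Event names come from the abstract; everything else here is illustrative.
EVENTS = ("gunshot", "explosion", "engine", "car_braking")


def log_forward(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete observations."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)  # P(obs[t] | obs[:t]); assumed to be > 0
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def event_confidences(obs, models):
    """Softmax over per-event log-likelihoods (an assumed normalisation)."""
    scores = {name: log_forward(obs, *models[name]) for name in EVENTS}
    top = max(scores.values())
    exp = {name: math.exp(v - top) for name, v in scores.items()}
    total = sum(exp.values())
    return {name: exp[name] / total for name in EVENTS}


def fusion_vector(segments, models):
    """Mean confidence per event over a window of segments (assumed pooling)."""
    per_segment = [event_confidences(obs, models) for obs in segments]
    return [sum(c[name] for c in per_segment) / len(per_segment) for name in EVENTS]


def toy_model(p_symbol0):
    """Hypothetical one-state HMM over a two-symbol alphabet."""
    return ([1.0], [[1.0]], [[p_symbol0, 1.0 - p_symbol0]])


# Hypothetical event models and a window of three quantised audio segments.
models = {
    "gunshot": toy_model(0.9),
    "explosion": toy_model(0.6),
    "engine": toy_model(0.3),
    "car_braking": toy_model(0.1),
}
window = [[0, 0, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0]]
features = fusion_vector(window, models)
```

Here `features` holds one pooled confidence per event, in the order given by `EVENTS`. In a real system the event models would be trained on labelled audio features rather than hand-set, and the pooled vectors from many windows would serve as training data for the context-level classifier.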


Similar resources

Semantic Context Detection Using Audio Event Fusion

Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this w...


Multimodal Information Fusion for Semantic Video Analysis

Multimedia data by its very nature contains multimodal information in it. For a successful analysis of multimedia content, all available multimodal information should be utilized. Additionally, since concepts can contain valuable cues about other concepts, concept interaction is a crucial source of multimedia information and helps to increase the fusion performance. The aim of this study is to ...


Non-Speech Acoustic Event Detection Using

Non-speech acoustic event detection (AED) aims to recognize events that are relevant to human activities associated with audio information. Much previous research has been focused on restricted highlight events, and highly relied on ad-hoc detectors for these events. This thesis focuses on using multimodal data in order to make non-speech acoustic event detection and classification tasks more r...


Event Detection in Basketball Video Using Multiple Modalities

Semantic sports video analysis has attracted more and more attention recently. In this paper, we present a basketball event detection method by using multiple modalities. Instead of using low-level features, the proposed method is built upon visual and auditory midlevel features i.e. semantic shot classes and audio keywords. Promising event detection results have been achieved. By heuristically...


IRIT @ TRECVid 2010 : Hidden Markov Models for Context-aware Late Fusion of Multiple Audio Classifiers

This notebook paper describes the four runs submitted by IRIT at TRECVid 2010 Semantic Indexing task. The four submitted runs can be described and compared as follows:

  • Run 4 – late fusion (weighted sum) of multiple audio-only classifiers output
  • Run 3 – context-aware re-rank of run 4 using hidden Markov model
  • Run 2 – context-aware late fusion of multiple audio classifiers output with hidde...




Publication date: 2006